From Expectation to Habit: Why Do Software Practitioners Adopt Fairness Toolkits?
Voria, Gianmario, Lambiase, Stefano, Schiavone, Maria Concetta, Catolino, Gemma, Palomba, Fabio
As the adoption of machine learning (ML) systems continues to grow across industries, concerns about fairness and bias in these systems have taken center stage. Fairness toolkits, designed to mitigate bias in ML models, serve as critical tools for addressing these ethical concerns. However, their adoption in the context of software development remains underexplored, especially regarding the cognitive and behavioral factors driving their usage. As a deeper understanding of these factors could be pivotal in refining tool designs and promoting broader adoption, this study investigates the factors influencing the adoption of fairness toolkits from an individual perspective. Guided by the Unified Theory of Acceptance and Use of Technology (UTAUT2), we examined the factors shaping the intention to adopt and actual use of fairness toolkits. Specifically, we employed Partial Least Squares Structural Equation Modeling (PLS-SEM) to analyze data from a survey study involving practitioners in the software industry. Our findings reveal that performance expectancy and habit are the primary drivers of fairness toolkit adoption. These insights suggest that by emphasizing the effectiveness of these tools in mitigating bias and fostering habitual use, organizations can encourage wider adoption. Practical recommendations include improving toolkit usability, integrating bias mitigation processes into routine development workflows, and providing ongoing support to ensure professionals see clear benefits from regular use.
- Research Report > New Finding (1.00)
- Questionnaire & Opinion Survey (1.00)
- Overview (1.00)
- Research Report > Experimental Study (0.93)
- Information Technology > Data Science > Data Mining (1.00)
- Information Technology > Artificial Intelligence > Machine Learning (1.00)
- Information Technology > Artificial Intelligence > Issues > Social & Ethical Issues (0.46)
- Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.46)
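The study's structural model can be loosely approximated in a few lines; the sketch below is NOT the authors' analysis, only a rough stand-in for full PLS-SEM that regresses a simulated behavioural-intention score on composite construct scores with ordinary least squares. All data and effect sizes are simulated for illustration.

```python
# Illustrative sketch, not the authors' code: approximating a UTAUT2-style
# structural (inner) model with composite scores and least squares.
import numpy as np

rng = np.random.default_rng(0)
n = 200

# Stand-ins for survey constructs (in the paper these are multi-item
# Likert-scale composites).
performance_expectancy = rng.normal(size=n)
habit = rng.normal(size=n)

# Behavioural intention driven by the two constructs the study found to be
# the primary drivers, plus noise. Path weights 0.5 / 0.4 are invented.
intention = (0.5 * performance_expectancy
             + 0.4 * habit
             + rng.normal(scale=0.5, size=n))

# Inner model estimated by least squares (PLS-SEM additionally iterates
# over an outer measurement model, omitted here).
X = np.column_stack([performance_expectancy, habit, np.ones(n)])
coef, *_ = np.linalg.lstsq(X, intention, rcond=None)
pe_path, habit_path = coef[0], coef[1]
```

With enough respondents the recovered path estimates land close to the simulated weights, mirroring how the structural coefficients in the paper quantify each construct's influence on intention.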
More than Marketing? On the Information Value of AI Benchmarks for Practitioners
Hardy, Amelia, Reuel, Anka, Meimandi, Kiana Jafari, Soder, Lisa, Griffith, Allie, Asmar, Dylan M., Koyejo, Sanmi, Bernstein, Michael S., Kochenderfer, Mykel J.
Public AI benchmark results are widely broadcast by model developers as indicators of model quality within a growing and competitive market. However, these advertised scores do not necessarily reflect the traits of interest to those who will ultimately apply AI models. In this paper, we seek to understand if and how AI benchmarks are used to inform decision-making. Based on the analyses of interviews with 19 individuals who have used, or decided against using, benchmarks in their day-to-day work, we find that across these settings, participants use benchmarks as a signal of relative performance difference between models. However, whether this signal was considered a definitive sign of model superiority, sufficient for downstream decisions, varied. In academia, public benchmarks were generally viewed as suitable measures for capturing research progress. By contrast, in both product and policy, benchmarks -- even those developed internally for specific tasks -- were often found to be inadequate for informing substantive decisions. Of the benchmarks deemed unsatisfactory, respondents reported that their goals were neither well-defined nor reflective of real-world use. Based on the study results, we conclude that effective benchmarks should provide meaningful, real-world evaluations, incorporate domain expertise, and maintain transparency in scope and goals. They must capture diverse, task-relevant capabilities, be challenging enough to avoid quick saturation, and account for trade-offs in model performance rather than relying on a single score. Additionally, proprietary data collection and contamination prevention are critical for producing reliable and actionable results. By adhering to these criteria, benchmarks can move beyond mere marketing tricks into robust evaluative frameworks.
- Questionnaire & Opinion Survey (1.00)
- Personal > Interview (1.00)
- Research Report > New Finding (0.93)
Scaling Technology Acceptance Analysis with Large Language Model (LLM) Annotation Systems
Smolinski, Pawel Robert, Januszewicz, Joseph, Winiarski, Jacek
Technology acceptance models effectively predict how users will adopt new technology products. Traditional surveys, often expensive and cumbersome, are commonly used for this assessment. As an alternative to surveys, we explore the use of large language models for annotating online user-generated content, like digital reviews and comments. Our research involved designing an LLM annotation system that transforms reviews into structured data based on the Unified Theory of Acceptance and Use of Technology (UTAUT) model. We conducted two studies to validate the consistency and accuracy of the annotations. Results showed moderate-to-strong consistency of LLM annotation systems, which improved further when the model temperature was lowered. LLM annotations achieved close agreement with human expert annotations and outperformed the agreement between experts for UTAUT variables. These results suggest that LLMs can be an effective tool for analyzing user sentiment, offering a practical alternative to traditional survey methods and enabling deeper insights into technology design and adoption.
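A system like the one described would need to force the LLM's free-text reply into a fixed schema before analysis. The sketch below is hypothetical: the construct list, the 1–7 score range, and the JSON shape are assumptions for illustration, not the paper's actual specification.

```python
# Hypothetical sketch of validating structured LLM annotations against a
# UTAUT-style schema. Construct names and score range are assumptions.
import json

UTAUT_CONSTRUCTS = (
    "performance_expectancy",
    "effort_expectancy",
    "social_influence",
    "facilitating_conditions",
)

def parse_annotation(raw: str) -> dict:
    """Validate an LLM reply: every construct present, scores in 1..7."""
    data = json.loads(raw)
    for construct in UTAUT_CONSTRUCTS:
        score = data[construct]
        if not 1 <= score <= 7:
            raise ValueError(f"{construct} out of Likert range: {score}")
    return data

# Example reply in the assumed format.
reply = ('{"performance_expectancy": 6, "effort_expectancy": 5, '
         '"social_influence": 3, "facilitating_conditions": 4}')
annotation = parse_annotation(reply)
```

Rejecting malformed or out-of-range replies at this stage is one plausible way to obtain the structured, survey-like data the consistency studies compare against human expert annotations.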
Factors Influencing User Willingness To Use SORA
Mvondo, Gustave Florentin Nkoulou, Niu, Ben
Sora promises to redefine the way visual content is created. Despite its numerous forecasted benefits, the drivers of user willingness to use the text-to-video (T2V) model are unknown. This study extends the extended unified theory of acceptance and use of technology (UTAUT2) with perceived realism and novelty value. Using a purposive sampling method, we collected data from 940 respondents in the US and analyzed the sample using covariance-based structural equation modeling and fuzzy set qualitative comparative analysis (fsQCA). The findings reveal that all hypothesized relationships are supported, with perceived realism emerging as the most influential driver, followed by novelty value. Moreover, fsQCA identifies five configurations leading to high and low willingness to use, and the model demonstrates high predictive validity, contributing to theory advancement. Our study provides valuable insights for developers and marketers, offering guidance for strategic decisions to promote the widespread adoption of T2V models.
- Research Report > New Finding (1.00)
- Research Report > Experimental Study (0.66)
- Health & Medicine (0.47)
- Transportation (0.46)
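The fsQCA step in analyses like this one rests on calibrating raw survey scores into fuzzy-set memberships. The sketch below shows Ragin-style "direct calibration" via a logistic transform with three anchors (full membership, crossover, full non-membership); the anchor values and the 7-point scale are illustrative, not taken from the paper.

```python
# Sketch of fsQCA direct calibration: raw scores -> fuzzy-set membership
# via a logistic transform anchored at three qualitative thresholds.
import math

def calibrate(x: float, full: float, crossover: float, non: float) -> float:
    """Fuzzy membership of raw score x given the three anchors."""
    if x >= crossover:
        log_odds = 3.0 * (x - crossover) / (full - crossover)
    else:
        log_odds = -3.0 * (crossover - x) / (crossover - non)
    return math.exp(log_odds) / (1 + math.exp(log_odds))

# A 7-point willingness scale with anchors 6 (in), 4 (crossover), 2 (out):
memberships = [calibrate(x, 6, 4, 2) for x in (2, 4, 6, 7)]
```

A score at the crossover anchor maps to exactly 0.5 membership, while scores at the full-membership and full-non-membership anchors land near the conventional 0.95 and 0.05 thresholds; the calibrated memberships then feed the truth-table analysis that identifies configurations of high and low willingness to use.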
The Impact of Performance Expectancy, Workload, Risk, and Satisfaction on Trust in ChatGPT: Cross-sectional Survey Analysis
Shamszare, Hamid, Choudhury, Avishek
This study investigated how perceived workload, satisfaction, performance expectancy, and risk-benefit perception influenced users' trust in Chat Generative Pre-Trained Transformer (ChatGPT). We aimed to understand the nuances of user engagement and provide insights to improve future design and adoption strategies for similar technologies. A semi-structured, web-based survey was conducted among adults in the United States who actively use ChatGPT at least once a month. The survey was conducted from 22nd February 2023 through 24th March 2023. We used structural equation modeling to understand the relationships among the constructs of perceived workload, satisfaction, performance expectancy, risk-benefit, and trust. The analysis of 607 survey responses revealed a significant negative relationship between perceived workload and user satisfaction, a negative but nonsignificant relationship between perceived workload and trust, and a positive relationship between user satisfaction and trust. Trust was also found to increase with performance expectancy. In contrast, the relationship between the benefit-to-risk ratio of using ChatGPT and trust was nonsignificant. The findings underscore the importance of ensuring user-friendly design and functionality in AI-based applications to reduce workload and enhance user satisfaction, thereby increasing user trust. Future research should further explore the relationship between the benefit-to-risk ratio and trust in the context of AI chatbots.
A priori acceptance of highly automated cars in Australia, France, and Sweden: A theoretically-informed investigation guided by the TPB and UTAUT
Applied the TPB and UTAUT to assess a priori acceptance of highly automated cars. Drivers residing in France reported greater intentions to use highly automated cars in the future. More research is required to further assess the suitability of the TPB and UTAUT for explaining intentions to use AVs. To assess and explain drivers’ a priori acceptance of highly automated cars in detail, this study used the Theory of Planned Behaviour (TPB) and the Unified Theory of Acceptance and Use of Technology (UTAUT). Further, the current study sought to extend previous research by assessing whether intentions to use highly automated cars in the future differed by country (i.e., Australia, France, and Sweden).
- Transportation > Passenger (1.00)
- Transportation > Ground > Road (1.00)
- Information Technology > Robotics & Automation (1.00)